HW/SW Co-Verification Tools Make Some Progress

By William MacKenzie and Janet Demaine


The argument for true HW/SW co-verification has grown increasingly persuasive over the past few years. As semiconductor manufacturers migrate to deep submicron (DSM) process technologies, the skyrocketing non-recurring engineering (NRE) costs associated with ASIC development have increased the pressure on designers. Re-spinning an ASIC is quickly disappearing as an option. Even writing software patches to fix bugs can become extremely expensive in today's development environment.

A true HW/SW co-verification environment brings together five basic components:

  1. Software Execution Environment: This environment defines where the computation specified by the system software will be executed. It can take many forms, including a program compiled for the host workstation or an instruction set simulator (ISS) for the target processor. In addition, the software can be compiled and run in a host-compiled RTOS emulator such as Vx-sim from Wind River Systems, Inc. (Alameda, CA).

  2. Logic Simulation Environment: This environment holds the executable model of the hardware as specified in HDLs and provides a venue where the model can demonstrate a response to applied stimulus. Typical options include an event-driven simulator, a cycle-based simulator or a logic emulation system.

  3. Software Debugger: This tool supplies a window into the executing software. Designers use the debugger to observe and control the software. Typically, it is closely tied to the software execution environment and plays a critical role in software development. Popular software debuggers include Berkeley's DBX and GNU's GDB.

  4. Hardware Debugger: On the hardware side, designers use this tool to observe and control the executing hardware. The hardware debugger allows the designer to visualize the hardware model as stimulus is applied to it. One example of a hardware debugger for Verilog simulators is Synopsys' Vir-sim.

  5. Integration Mechanisms: Mechanisms for integrating the software execution and logic simulation environments can be generally divided into two groups: spatial and temporal. Spatial mechanisms define those places where the hardware and software meet. They are usually built around a map of an address space accessed by the software and the buses used to carry out those hardware accesses. Temporal mechanisms focus on the proper synchronization of the software execution and logic simulation environments. They typically ensure that the hardware is synchronized with the software when specific hardware addresses are accessed, and that the software synchronizes with the hardware during certain asynchronous hardware events. (A minimal sketch of these mechanisms follows this list.)
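
To make the spatial mechanism concrete, the C sketch below shows how a central read hook might check each software access against an address map and forward only simulated-hardware accesses to the logic simulator. The type and function names are invented for illustration, not V-CPU's actual API, and the call into the simulator is stubbed out.

/* Hypothetical spatial integration mechanism: each software access is
 * checked against an address map; only hits on simulated-hardware
 * ranges are forwarded to the logic simulator. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { uint32_t base, size; } region_t;

/* Address ranges owned by the simulated hardware (example values). */
static const region_t addr_map[] = {
    { 0x40000000u, 0x00010000u },   /* device registers */
    { 0x50000000u, 0x00100000u },   /* packet buffers   */
};

/* Stub for the call into the logic simulator. In a real tool this
 * drives the bus-functional model and blocks until the simulated bus
 * cycle completes; that is also the temporal synchronization point. */
static uint32_t sim_bus_read32(uint32_t addr)
{
    printf("simulated bus read @ 0x%08x\n", addr);
    return 0xdeadbeefu;
}

static int is_simulated(uint32_t addr)
{
    for (size_t i = 0; i < sizeof addr_map / sizeof addr_map[0]; i++)
        if (addr - addr_map[i].base < addr_map[i].size)
            return 1;
    return 0;
}

static uint8_t host_mem[1024];          /* stand-in for host memory */

/* Central read hook: route to the simulator or to plain host memory. */
uint32_t coverif_read32(uint32_t addr)
{
    if (is_simulated(addr))
        return sim_bus_read32(addr);
    return host_mem[addr % sizeof host_mem];
}

int main(void)
{
    printf("0x%08x\n", coverif_read32(0x40000004u)); /* to simulator */
    printf("0x%08x\n", coverif_read32(0x00001000u)); /* host memory  */
    return 0;
}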

As a result, the traditional approach to HW/SW product design flows, in which software development is postponed until hardware breadboards or prototypes are available, is quickly becoming outdated. That approach may have made sense when IC options were relatively limited and software played a less influential role in end-product functionality. But as faster processors have allowed designers to embed more of a product's functionality in firmware, and those functions have evolved from non-real-time to real-time tasks, the opportunities for bugs to arise in the HW/SW interface have grown exponentially.

As embedded systems designs have evolved, so too have the requirements for true HW/SW co-verification. Today embedded designers must verify three distinct software components against their hardware: the application code, the hardware drivers, and the system diagnostic code. Moreover, as embedded systems have become increasingly multi-functional, designers have commonly built their designs around a real-time operating system (RTOS). To deal with this increased software complexity while still meeting stringent time-to-market requirements, software teams are turning to software intellectual property (IP) reuse methodologies. This software IP may be legacy code from previous projects or third-party software IP.

Most of the HW/SW co-verification tools available to date have offered limited solutions. Typically, they have focused on a specific piece of the problem. For example, traditional co-verification solutions might verify the system diagnostics. Prior to silicon availability, this step can shorten the development cycle; however, interface bugs in hardware drivers or application code, whether with the hardware itself or with the RTOS, can easily offset these advantages by forcing designers to spend precious time re-spinning silicon or rewriting software. Few co-verification tools available today address the growing need to verify hardware and software in the context of an RTOS.

Traditional approach

At Enterasys Networks (Rochester, NH), a network equipment solutions provider, we faced these same issues as we set out to design the Matrix E7, a new Ethernet switch for wiring closets in enterprise applications. Like many design teams, we had relied heavily on exercising a prototype in the lab to pinpoint and resolve bugs in our designs. We'd rewrite some diagnostics in Verilog so that the design team could run them in the simulation environment. That allowed firmware developers to get a feel for how the design would perform and gave the hardware designers a sense of how the software developers needed the hardware to operate.

With most of our new products, the firmware and application code were tested only after a prototype was built and the hardware design shaken out. Typically, designers would load the code on the prototype, run diagnostics or code that closely resembled the end-customer's actual application and, only at that point, start to get a sense of how the pieces of the overall design worked together. Of course, at this juncture we'd find some bugs due to misunderstandings or misinterpretations of some aspect of the design by someone on the hardware or firmware team. Once a bug was discovered, the designer would have to pull out a logic analyzer and other debug tools for a diagnosis process that could take a week or more, depending on the problem.

The sheer complexity of the Matrix E7 design prompted us to take a different approach. The design team's goals were ambitious. The Matrix E7 marked the first of a new generation of intelligent network switches designed to support applications ranging from enterprise data centers and high-performance wiring closets to remote offices. To achieve that goal, the switch would need a highly scalable architecture capable of supporting physical interfaces up to 10 Gbits/s in total link bandwidth. If such a product were to have comprehensive service and security capabilities as well as guarantee delivery of heavy traffic, it would require a distributed switching architecture, a passive high-speed switch fabric, and redundant power.

Included among our performance requirements was the ability to support more than 500 10/100 Ethernet ports and more than 80 Gigabit Ethernet ports from a single chassis. For bandwidth-intensive applications, aggregate switching throughput had to exceed 100 million packets per second. One key goal was the integration of extensive intelligence on board the switch. We decided that the addition of such advanced features as multi-layer filtering and a packet flow control method called differentiated services support would be needed to support the classification and control of network traffic. At the same time, designers needed to ensure the reliable delivery of applications such as voice-over-IP, streaming video, e-commerce transactions and ERP applications.

Of course, integrating such advanced features into a switch while keeping our cost goals in sight meant rolling multiple functions into a smaller set of components. We decided to fold traffic control features that would otherwise require add-on cards and router modules into the basic product by embedding those services in multiple ASICs.

To verify hardware and software interaction prior to the availability of a full hardware prototype, we used Virtual-CPU (V-CPU), a co-verification environment offered by Innoveda (Marlboro, MA). The product combines a software execution environment that runs the embedded system software as if it were running on a target CPU, with a logic simulation environment that runs a representation of the embedded system hardware and responds to bus cycles as if they were initiated by the target CPU. The target processor, however, is replaced by the interaction of a bus-functional model (BFM) of the processor in the hardware environment and a virtual processor in the software environment. The software execution environment runs in either host-code execution mode in a workstation process or in target-code execution mode within an instruction set simulator (ISS).

How a co-verification environment integrates the software execution and logic simulation environments plays a pivotal role in its ability to offer a comprehensive and accurate simulation. V-CPU uses four primary mechanisms to accomplish this integration: a memory map; CPU resource and software functional models; a BFM; and a mechanism called implicit hardware access.

Finding bugs early

Typically, designers begin using V-CPU by modeling the embedded system hardware in either a hardware description language (HDL) or in C/C++. For the most part, Enterasys engineers used Verilog. System software in V-CPU is written in C, C++ or the assembly language of the target processor. It is then compiled or assembled to run as a workstation process or within the ISS. The software runs on the workstation and accesses user-defined address ranges via the bus-functional model that mimics the processor interface. Algorithms in the program are executed on the virtual processor. For this project, our software developers decided to use the GNU debugger (an open-source debugging tool), while our hardware developers relied on Synopsys' Vir-sim, a simulation debug and analysis tool.

The introduction of V-CPU cut time out of the Enterasys product development cycle in a variety of ways. The ability to simulate hardware drivers proved a major advantage. Simple miscommunications or oversights can easily translate into significant bugs in the design of hardware drivers. A discrepancy in a document, the misinterpretation of a specification, or the designation of the wrong pointer are common errors that can easily lead to major delays in the development process. As a case in point, when one of our firmware developers was writing code for a memory I/O device on the Matrix E7, he thought the device had to perform a standard 32-bit access. But as he stepped through his diagnostic code in V-CPU, he discovered a failure. After reviewing the executed code using the GNU debugger and working with the hardware engineer responsible for the device, he realized the design called for a 32-byte burst read or write.
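
The sketch below illustrates that kind of mismatch, with invented names and addresses: the first routine performs the single 32-bit access the firmware developer originally assumed, while the second issues the eight consecutive word reads that make up the 32-byte burst the hardware actually required.

/* Illustrative only; the device window address is hypothetical. */
#include <stdint.h>

#define DEV_WINDOW ((volatile uint32_t *)0x40001000u)

/* What the firmware originally did: a single 32-bit access,
 * which this particular device does not handle correctly. */
uint32_t read_word_wrong(void)
{
    return DEV_WINDOW[0];
}

/* What the hardware required: a 32-byte (eight-word) burst. From C
 * this appears as eight consecutive word reads; on the target, the
 * cache and bus logic turn a cache-line fill into one burst cycle. */
void read_burst_right(uint32_t buf[8])
{
    for (int i = 0; i < 8; i++)
        buf[i] = DEV_WINDOW[i];
}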

Another example occurred with the implementation of a key component in the high-performance switch design, a new Gigabit MAC chip that the design team had never used before. On previous projects, our hardware design engineers would have designed the chip into the board and, based on discussions with the team, firmware designers would develop the drivers. But testing of the drivers would have to wait until the prototype stage. Any bugs found at that point would typically take a week or more to resolve.

With V-CPU, we used the time between the development of the hardware design and the arrival of prototype boards to run simulations with the firmware. By bringing up the actual driver in V-CPU, we immediately discovered both a firmware and a hardware bug. In the lab, such errors would have cost us a week's time debugging the code. By addressing those issues at the simulation stage and debugging the code against the hardware simulation before the prototype board arrived, the team was able to ensure that the driver for the Gigabit MAC would run correctly the first time the board was brought up.

The simulation environment even saved designers time in solving relatively simple problems, such as finding bugs that caused the external bus controller to hang. One such error would occur whenever the designers operated on external queues managed by both the hardware and firmware. When the firmware used a specific order of operation to access the queues that was not implemented correctly in hardware, all hardware accesses by firmware would seize up. By bringing up Vir-sim and tracing the execution back in time, the bug was found and resolved in less than two hours. That was achievable because we could, using the simulation environment, look back at all of the ASICs' internal signals. If we had been in the lab using an FPGA prototype, we would have had to set up the necessary debug equipment and create new FPGA code to bring the necessary signals out to external pins, a process that could easily take more than a day.
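
The Matrix E7's actual queue protocol isn't spelled out here, but the C sketch below, with invented register names, shows the general shape of such an ordering contract: the firmware must make a descriptor visible to the hardware before it writes the doorbell register, and on a PowerPC an eieio barrier enforces that order.

/* Hypothetical firmware/hardware queue with an ordering requirement. */
#include <stdint.h>

typedef struct {
    uint32_t addr;   /* buffer address */
    uint32_t len;    /* buffer length  */
} desc_t;

#define QUEUE    ((volatile desc_t *)0x50000000u)    /* hypothetical */
#define DOORBELL ((volatile uint32_t *)0x50010000u)  /* hypothetical */

/* On PowerPC, eieio orders accesses to device memory so the
 * descriptor writes reach the hardware before the doorbell write. */
static inline void io_barrier(void)
{
    __asm__ volatile("eieio" ::: "memory");
}

void enqueue(unsigned slot, uint32_t addr, uint32_t len)
{
    QUEUE[slot].addr = addr;   /* step 1: fill in the descriptor */
    QUEUE[slot].len  = len;
    io_barrier();              /* step 2: enforce the ordering   */
    *DOORBELL = slot;          /* step 3: tell hardware to fetch */
}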

The ability to modify the co-verification environment to specific hardware requirements was another key concern for our design team. That issue became extremely important as the designers worked with the cache capabilities built into the PowerPC 750, the embedded processor that lies at the heart of the Matrix E7.

One area of the design that the team was particularly interested in testing was the processor's ability, through specific cache-maintenance assembly functions, to designate which pieces of code or data to fetch next. The PowerPC allowed us to assign a pointer to memory that instructs the processor to perform a burst access into cache while the processor is running code. These functions can also perform a cache line store and zero out a cache line.
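
These operations map onto the PowerPC's dcbt (touch), dcbst (store) and dcbz (zero) cache instructions. The wrappers below show one common way to expose them to C diagnostics via GCC inline assembly; they are illustrative, not Enterasys' actual diagnostic code.

/* PowerPC cache-maintenance wrappers (GCC inline assembly). */

/* dcbt: hint the processor to burst-fetch the cache line
 * containing addr while other code continues to execute. */
static inline void cache_touch(const void *addr)
{
    __asm__ volatile("dcbt 0,%0" :: "r"(addr));
}

/* dcbst: store a modified cache line back to memory. */
static inline void cache_line_store(const void *addr)
{
    __asm__ volatile("dcbst 0,%0" :: "r"(addr) : "memory");
}

/* dcbz: establish a cache line and clear it to zero without
 * first reading its contents from memory. */
static inline void cache_line_zero(void *addr)
{
    __asm__ volatile("dcbz 0,%0" :: "r"(addr) : "memory");
}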

The diagnostics within the Matrix E7 call several of these cache-maintenance functions as assembler instructions. To help us run these pieces of code before the prototype stage, Innoveda engineers implemented a new set of application programming interfaces (APIs) within V-CPU to perform cache-maintenance functions on the built-in cache models provided with V-CPU. That allowed us to run the cache-maintenance functions for the PowerPC 750 within the simulation, and substantially increased code and hardware coverage in the process. This implementation allowed some very detailed hardware-specific checkout to be performed while staying with host-compiled software execution.

High-level software applications, however, could only be run in the context of an RTOS. For example, the Matrix E7 was designed from the outset to be managed by a command line interface (CLI) or the Simple Network Management Protocol (SNMP). Both the CLI and SNMP are highly dependent on the operating system, so naturally we wanted to simulate their performance before the prototype stage of development.

Having the RTOS in place also allowed designers to test out a portion of the path that allows the firmware to transmit packets onto the network. The performance of this path is highly dependent on the RTOS. Through simulation, we were able to assess early in the design cycle whether it made sense to modify the design in order to remove or mitigate those dependencies.

VxWorks and V-CPU interface

One obstacle for our engineers was the interface between V-CPU and Wind River Systems' (Alameda, CA) VxWorks RTOS. We've found that most available co-verification tools offer limited ability to verify hardware and software in the context of an RTOS. Typically, a user has to compile the RTOS code and run it in the co-verification tool's execution environment. In host mode, that would involve the unlikely prospect of porting the RTOS to the host workstation. And even if that were possible, any software components purchased from third-party suppliers, such as a protocol stack, would be difficult to modify to run in a host-mode environment.

One alternative was to run the RTOS in target mode. But that would have entailed the time-consuming process of developing a board support package for the hardware, cross-compiling the RTOS, and loading it into an ISS. In addition, the use of an ISS would dramatically slow performance of the RTOS.

For our design team, the best solution for bridging the gap between VxWorks and the HW/SW co-verification tool was Vx-sim, a host-compiled emulation environment developed for VxWorks by Wind River Systems. An interface between Vx-sim and V-CPU, called the V-CPU RTOS Support Package for Vx-sim, linked the VxWorks RTOS directly to the hardware simulator in V-CPU. This allowed us to debug device drivers in Wind River Systems' Tornado environment with Vx-sim while accessing a debugger connected to the Verilog simulation of the target hardware. Our software engineers had access to the complete Tornado development environment on the host workstation and were not required to make changes to the source code when making memory and register accesses to the simulated hardware. Hardware engineers could run a true system simulation using stimulus from the system software instead of contrived testbenches, and could record and play back bus cycles for debug and regression testing (see Figure 1).
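
The practical upshot is that driver code can keep its ordinary pointer-based register accesses. The fragment below, with a hypothetical register address and bit definition, shows the style of access that runs unchanged both on the prototype board and against the Verilog simulation:

#include <stdint.h>

#define MAC_CTRL ((volatile uint32_t *)0x40002000u)  /* hypothetical */
#define MAC_EN   0x00000001u                         /* hypothetical */

/* The same source runs on the target and, unmodified, against the
 * simulated hardware through the Vx-sim/V-CPU link. */
void mac_enable(void)
{
    *MAC_CTRL |= MAC_EN;
}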

The implicit hardware access feature of V-CPU made it possible to include software IP in the simulation environment. In Enterasys' case, we were purchasing our protocol stack code from a third party. It was not possible to have this third party modify the code with explicit calls whenever it needed to access a hardware register. Of course, the protocol stack was also dependent on the VxWorks RTOS, which was running in host mode. Without the implicit hardware access feature, the software team would have had to wait for the prototype stage to run their higher-level software applications that were dependent on the RTOS and the protocol stack.
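
V-CPU's internal mechanism aside, a common way to implement implicit hardware access on a host workstation is to reserve the device's address range with no access permissions and trap the fault raised by any unmodified load or store. The C skeleton below sketches that idea under POSIX; a production tool would also have to decode the faulting instruction and complete the access through the simulator.

/* Skeleton of fault-based implicit hardware access (POSIX). */
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define DEV_BASE ((void *)0x40000000)  /* hypothetical device range */
#define DEV_SIZE 0x10000

static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    /* si->si_addr is the hardware address the software touched. A
     * real tool would decode the faulting load or store, run the
     * bus cycle in the simulator, patch the result into the saved
     * register context, and resume execution. */
    fprintf(stderr, "implicit hardware access @ %p\n", si->si_addr);
    exit(0);
}

int main(void)
{
    struct sigaction sa = { 0 };
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* Reserve the device range so plain pointer accesses fault. */
    mmap(DEV_BASE, DEV_SIZE, PROT_NONE,
         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    /* Unmodified third-party code simply dereferences the address. */
    volatile uint32_t value = *(volatile uint32_t *)DEV_BASE;
    (void)value;
    return 0;
}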

While co-verification tools are still in the early stages of their development, co-verification is quickly becoming an essential methodology for developers of complex systems. With software playing such an important role in overall end-product functionality, a design team can no longer postpone software development until hardware breadboards or prototypes are available.

Developers must now put the pieces together simultaneously, revising and re-partitioning their designs along the way to maximize performance and reliability. In the process, they can also save tremendous amounts of development time.

The final word

Ultimately, the introduction of a co-verification methodology into the Matrix E7 design cycle has not eliminated the prototype test stage. That remains an important part of the Enterasys product development cycle. But new simulation capabilities that allowed embedded software compiled for the host platform to access hardware without modification, along with the availability of a host-compiled simulation environment for the RTOS, allowed our team to work with an integrated system simulation well before a full hardware implementation was available. That gave us an opportunity to analyze the interaction between hardware and software and to detect and resolve bugs in interfaces, device drivers, third-party software IP and our own application software much earlier than on previous projects.


Janet Demaine is the principal ASIC design engineer in the Switch Group of Enterasys Networks, Inc. (Rochester, NH). Previously, she worked as a LAN development engineer at Hewlett-Packard.

William MacKenzie is a senior firmware design engineer for Enterasys Networks. His primary responsibilities include architecture and design for next generation switch/router firmware.

To voice an opinion on this or any other article in Integrated System Design, please e-mail your comments to sdean@cmp.com.
